
    Requisite Variety in Ethical Utility Functions for AI Value Alignment

    Being a complex subject of major importance in AI Safety research, value alignment has been studied from various perspectives in recent years. However, no final consensus on the design of ethical utility functions facilitating AI value alignment has yet been reached. Given the urgency of identifying systematic solutions, we postulate that it might be useful to start with the simple fact that, for the utility function of an AI not to violate human ethical intuitions, it trivially has to be a model of these intuitions and reflect their variety; since humans are biological organisms whose brains construct concepts such as moral judgements, the most accurate models of these intuitions are scientific models. Thus, in order to better assess the variety of human morality, we perform a transdisciplinary analysis, applying a security mindset to the issue and summarizing variety-relevant background knowledge from neuroscience and psychology. We complement this information by linking it to augmented utilitarianism as a suitable ethical framework. Based on that, we propose initial practical guidelines for the design of approximate ethical goal functions that might better capture the variety of human moral judgements. Finally, we conclude and address possible future challenges. Comment: IJCAI 2019 AI Safety Workshop
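    As a loose illustration of what an approximate ethical goal function reflecting the variety of human moral judgements could look like, the following Python sketch aggregates several perceiver-dependent judgement models over (agent, action, context) triples. The data structures, names and simple mean aggregation are illustrative assumptions, not the construction proposed in the paper.

```python
# Minimal sketch, assuming (agent, action, context) triples are rated by a
# set of human-derived judgement models. All names and the simple mean
# aggregation are illustrative assumptions, not the paper's construction.
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Sequence


@dataclass(frozen=True)
class MoralSituation:
    agent: str    # who acts
    action: str   # what is done
    context: str  # under which circumstances


# A judgement model maps a situation to a rating in [-1, 1], standing in for
# one evaluator's (or one scientific model's) moral intuition.
JudgementModel = Callable[[MoralSituation], float]


def approximate_ethical_goal_function(
    situation: MoralSituation,
    judgement_models: Sequence[JudgementModel],
) -> float:
    """Aggregate several context-sensitive judgement models.

    Keeping many distinct models is a crude way to respect the requisite
    variety of human moral judgements; a single model would collapse it.
    """
    return mean(model(situation) for model in judgement_models)


# Usage: two toy judgement models that weigh the same situation differently.
harm_averse = lambda s: -1.0 if "harm" in s.action else 0.5
context_sensitive = lambda s: 0.2 if "emergency" in s.context else -0.5

situation = MoralSituation(
    agent="AI system",
    action="withhold a harm warning",
    context="routine operation",
)
print(approximate_ethical_goal_function(situation, [harm_averse, context_sensitive]))
```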

    Transdisciplinary AI Observatory -- Retrospective Analyses and Future-Oriented Contradistinctions

    In recent years, AI safety has gained international recognition in light of heterogeneous safety-critical and ethical issues that risk overshadowing the broad beneficial impacts of AI. In this context, the implementation of AI observatory endeavors represents one key research direction. This paper motivates the need for an inherently transdisciplinary AI observatory approach integrating diverse retrospective and counterfactual views. We delineate aims and limitations while providing hands-on advice using concrete practical examples. Distinguishing between unintentionally and intentionally triggered AI risks with diverse socio-psycho-technological impacts, we exemplify a retrospective descriptive analysis followed by a retrospective counterfactual risk analysis. Building on these AI observatory tools, we present near-term transdisciplinary guidelines for AI safety. As a further contribution, we discuss differentiated and tailored long-term directions through the lens of two disparate modern AI safety paradigms. For simplicity, we refer to these two different paradigms by the terms artificial stupidity (AS) and eternal creativity (EC) respectively. While both AS and EC acknowledge the need for a hybrid cognitive-affective approach to AI safety and overlap with regard to many short-term considerations, they differ fundamentally in the nature of multiple envisaged long-term solution patterns. By compiling relevant underlying contradistinctions, we aim to provide future-oriented incentives for constructive dialectics in practical and theoretical AI safety research.

    Immoral Programming: What can be done if malicious actors use language AI to launch ‘deepfake science attacks’?

    The problem-solving and imitation capabilities of AI are increasing. In parallel, research addressing ethical AI design has gained momentum internationally. However, from a cybersecurity-oriented perspective in AI safety, it is vital to also analyse and counteract the risks posed by intentional malice. Malicious actors could, for instance, exploit the attack surface of already deployed AI, poison AI training data, sabotage AI systems at the pre-deployment stage or deliberately design hazardous AI. At a time when topics such as fake news, disinformation, deepfakes and, recently, fake science are affecting online debates among the population at large as well as in scientific circles, we thematise the following elephant in the room now and not in hindsight: what can be done if malicious actors use AI for not yet prevalent but technically feasible ‘deepfake science attacks’, i.e. on (applied) science itself? Deepfakes are not restricted to audio and visual phenomena, and deepfake text, whose impact could be potentiated with regard to speed, scope, and scale, may represent an underestimated avenue for malicious actors. Not only has the imitation capacity of AI improved dramatically, e.g. with the advent of advanced language AI such as GPT-3 (Brown et al., 2020), but generally, present-day AI can already be abused for goals such as (cyber)crime (Kaloudi and Li, 2020) and information warfare (Hartmann and Giles, 2020). Deepfake science attacks on (applied) science and engineering – which belong to the class of what we technically denote as scientific and empirical adversarial (SEA) AI attacks (Aliman and Kester, 2021) – could be instrumental in achieving such aims due to socio-psycho-technological intricacies against which science might not be immune. But if not immunity, could one achieve resilience? This chapter familiarises the reader with a complementary solution to this complex issue: a generic ‘cyborgnetic’ defence (GCD) against SEA AI attacks. As briefly introduced in Chapter 4, the term cyborgnet (which is much more general than and not to be confused with the term ‘cyborg’) stands for a generic, substrate-independent and hybrid functional unit which is instantiated e.g. in couplings of present-day AIs and humans. Amongst many other fields, GCD uses epistemology, cybersecurity, cybernetics, and creativity research to tailor 10 generic strategies to the concrete exemplary use case of a large language model such as GPT-3. GCD can act as a cognitively diverse transdisciplinary scaffold to defend against SEA AI attacks – albeit with specific caveats.

    Moral Programming: Crafting a flexible heuristic moral meta-model for meaningful AI control in pluralistic societies

    Artificial Intelligence (AI) permeates more and more application domains. Its progress regarding scale, speed, and scope magnifies potential societal benefits but also ethically and safety-relevant risks. Hence, it becomes vital to seek meaningful control of present-day AI systems (i.e. tools). For this purpose, one can aim at counterbalancing the increasing problem-solving ability of AI with boundary conditions core to human morality. However, a major problem is that morality exists in a context-sensitive, steadily shifting explanatory sphere co-created by humans using natural language – which is inherently ambiguous at multiple levels and neither machine-understandable nor machine-readable. A related problem is what we call epistemic dizziness, a phenomenon linked to the inevitable circumstance that one could always be wrong. Yet, while universal doubt cannot be eliminated from morality, it need not be magnified if the potential/requirement for steady refinements is anticipated by design. In this view, morality pertains to the set of norms and values enacted at the level of a society, of other not further specified collectives of persons, or of an individual. Norms are instrumental in attaining the fulfilment of values, the latter being an umbrella term for all that seems decisive for distinctions between right and wrong – a central object of study in ethics. In short, for a meaningful control of AI against the background of the changing, context-sensitive and linguistically moulded nature of human morality, it is helpful to craft descriptive and thus sufficiently flexible AI-readable heuristic models of morality. In this way, the problem-solving ability of AI could be efficiently funnelled through these updatable models so as to ideally boost the benefits and mitigate the risks at the AI deployment stage, with the conceivable side-effect of improving human moral conjectures. For this purpose, we introduced a novel transdisciplinary framework denoted augmented utilitarianism (AU) (Aliman and Kester, 2019b), which is formulated from a meta-ethical stance. AU attempts to support the human-centred task of harnessing human norms and values to explicitly and traceably steer AI before humans themselves get unwittingly and unintelligibly steered by the obscurity of AI’s deployment. Importantly, AU is descriptive, non-normative, and explanatory (Aliman, 2020), and is not to be confused with normative utilitarianism. (While normative ethics pertains to ‘what one ought to do’, descriptive ethics relates to empirical studies on human ethical decision-making.) This chapter offers the reader a compact overview of how AU coalesces elements from AI, moral psychology, cognitive and affective science, mathematics, systems engineering, cybernetics, and epistemology to craft a generic scaffold able to heuristically encode given moral frameworks in a machine-readable form. We thematise novel insights and also caveats linked to advanced AI risks, yielding incentives for future work.
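    To make the notion of an AI-readable, updatable heuristic encoding of a moral framework slightly more tangible, here is a minimal Python sketch. The schema, field names and scoring rule are illustrative assumptions of ours; augmented utilitarianism itself is a meta-ethical framework and is not reducible to this toy example.

```python
# Minimal sketch of a machine-readable, updatable encoding of a moral
# framework. The schema and the scoring rule are illustrative assumptions;
# augmented utilitarianism (AU) itself is a meta-ethical framework and is
# not reducible to this toy example.
from dataclasses import dataclass, field


@dataclass
class Norm:
    description: str     # human-readable statement of the norm
    promoted_value: str  # the value the norm is instrumental in attaining
    weight: float        # revisable weight: one could always be wrong


@dataclass
class MoralFramework:
    norms: list[Norm] = field(default_factory=list)

    def update(self, norm: Norm) -> None:
        """Add or refine a norm; the encoding is meant to be steadily revised."""
        self.norms.append(norm)

    def score(self, satisfied: set[str]) -> float:
        """Heuristically score a candidate AI action by the norms it satisfies."""
        return sum(
            n.weight if n.description in satisfied else -n.weight
            for n in self.norms
        )


# Usage: a two-norm framework evaluated against a candidate action.
framework = MoralFramework()
framework.update(Norm("do not deceive users", promoted_value="trust", weight=1.0))
framework.update(Norm("explain decisions on request", promoted_value="transparency", weight=0.5))
print(framework.score(satisfied={"do not deceive users"}))  # 1.0 - 0.5 = 0.5
```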

    Bodily relations and reciprocity in the art of Sonia Khurana

    This article explores the significance of the ‘somatic’ and ‘ontological turn’ in locating the radical politics articulated in the contemporary performance, installation, video and digital art practices of New Delhi-based artist Sonia Khurana (b. 1968). Since the late 1990s, Khurana has fashioned a range of artworks that require new sorts of reciprocal and embodied relations with their viewers. While this line of art practice suggests the need for a primarily philosophical mode of inquiry into an art of the body, such affective relations also need to be historicised in relation to a discursive field of ‘difference’ and public expectations about the artist’s ethnic, gendered and national identity. Thus, this intimate, visceral and emotional field of inter- and intra-action is a novel contribution to recent transdisciplinary perspectives on the gendered, social and sentient body, which in turn prompts a wider debate on the ethics of cultural commentary and art historiography.

    On the Development of the Marshall Grazing Incidence X-ray Spectrograph (MaGIXS) Mirrors

    The Marshall Grazing Incidence X-ray Spectrograph (MaGIXS) is a sounding rocket experiment that will obtain spatially resolved soft X-ray spectra of the solar corona from 0.5 to 2 keV. The optical system comprises a Wolter-I telescope mirror, a slit spectrograph, and a CCD camera. The spectrograph has a finite-conjugate paraboloid pair, which re-images the slit, and a varied line-space planar reflection grating. Both the Wolter-I mirror and the paraboloid pair are being fabricated at the NASA Marshall Space Flight Center (MSFC) using nickel replication. The MaGIXS mirror mandrels have been diamond-turned and polished, and have yielded a set of engineering mirrors. Unlike other grazing incidence instruments, such as FOXSI, ART-XC, and IXPE, the MaGIXS prescriptions have a large departure from a cone. This property exacerbates challenges with conventional lap polishing techniques and interferometric metrology. Here we discuss the progression of the optical surfaces of the mandrels through lap polishing, X-ray data from the replicated shells obtained in the MSFC Stray Light Facility (SLF), and our transition to using the ZEEKO computer numerical controlled (CNC) polisher for figure correction.
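    For readers more accustomed to wavelength than energy units, the quoted 0.5 to 2 keV passband can be converted with the standard relation wavelength[Å] ≈ 12.398 / E[keV]; the short Python sketch below performs just this conversion (only the passband is taken from the abstract, the rest is generic physics), giving roughly 6.2 to 24.8 Å.

```python
# Convert the MaGIXS passband from photon energy to wavelength using
# E[keV] * lambda[Angstrom] = hc ~= 12.398 keV * Angstrom.
HC_KEV_ANGSTROM = 12.398


def energy_kev_to_wavelength_angstrom(energy_kev: float) -> float:
    return HC_KEV_ANGSTROM / energy_kev


for energy in (0.5, 2.0):
    wavelength = energy_kev_to_wavelength_angstrom(energy)
    print(f"{energy:.1f} keV -> {wavelength:.1f} Angstrom")
# 0.5 keV -> 24.8 Angstrom, 2.0 keV -> 6.2 Angstrom
```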